{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1> Demo. Data Access </h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook aims at documenting how to access & manipulate the input datasets for one \"ocean data challenge\".\n",
    "Two *tar.gz* archives with Sea Surface Height (SSH) datasets are available on the MEOM opendap server.\n",
    "The **dc_ref** dataset refers to the reference simulation, a.k.a NATL60-CMJ165 nature run carried out by the MEOM Team. The **dc_obs** corresponds to the observations datasets (for various altimeter missions) based on nadir (TOPEX/Poseidon, Jason1, Envisat, Geosat-2) and large swath (SWOT) orbits constructed with the [SWOTsimulator](https://github.com/SWOTsimulator/swotsimulator) package.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import xarray as xr\n",
    "import cftime\n",
    "import geoviews as gv\n",
    "import matplotlib.pylab as plt\n",
    "import numpy as np\n",
    "from datetime import datetime\n",
    "gv.extension('bokeh')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1) Download & extract **dc_ref** dataset (*Achtung: this may take several minutes !!!!!*)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_ref.tar.gz "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Decompress archive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tar -xvf dc_ref.tar.gz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Open **dc_ref** dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dc_ref = xr.open_mfdataset('./dc_ref/*.nc', combine='nested', concat_dim='time')\n",
    "dc_ref"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Extract daily mean 1 month sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "time_min = '20130901'\n",
    "time_max = '20130930'\n",
    "dc_ref_sample = dc_ref.sel(time=slice(time_min, time_max)).resample(time=\"1D\").mean()\n",
    "dc_ref_sample"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Display dc_ref_sample sea surface heigh (variable sossheig) dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = gv.Dataset(dc_ref_sample, ['lon', 'lat', 'time'], 'sossheig')\n",
    "images = dataset.to(gv.Image)\n",
    "images.opts(cmap='gist_stern', colorbar=True, width=500, height=400)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2) Download & extract **dc_obs** dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_obs.tar.gz "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tar -xvf dc_obs.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Ref start date of NATL60 simulation\n",
    "simu_start_date = '2012-10-01'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read nadir (Envisat Theoritical track)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_en(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    time_shift_en = 22.10114 # days \n",
    "    ds['time'] = cftime.num2date(ds['time'].values - time_shift_en * 24 * 3600, 'seconds since ' + simu_start_date)\n",
    "    # Change 0 as nan for the first and last cycles\n",
    "    ds['ssh_obs'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_obs'].values).filled(np.nan)\n",
    "    ds['ssh_model'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_model'].values).filled(np.nan)\n",
    "    return ds\n",
    "\n",
    "ds_en_nadir = xr.open_mfdataset('./dc_obs/en/*.nc', combine='nested', concat_dim='time', preprocess=preprocess_time_en)\n",
    "ds_en_nadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read nadir (Geosat-2 Theoritical track)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_g2(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    time_shift_g2 = 15.08489 # days\n",
    "    ds['time'] = cftime.num2date(ds['time'].values - time_shift_g2 * 24 * 3600, 'seconds since ' + simu_start_date)\n",
    "    # Change 0 as nan for the first and last cycles\n",
    "    ds['ssh_obs'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_obs'].values).filled(np.nan)\n",
    "    ds['ssh_model'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_model'].values).filled(np.nan)\n",
    "    return ds\n",
    "\n",
    "ds_g2_nadir = xr.open_mfdataset('./dc_obs/g2/*.nc', combine='nested', concat_dim='time', preprocess=preprocess_time_g2)\n",
    "ds_g2_nadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read nadir (Jason-1 Theoritical track)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_j1(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    time_shift_j1 = 3.736615 # days\n",
    "    ds['time'] = cftime.num2date(ds['time'].values - time_shift_j1 * 24 * 3600, 'seconds since ' + simu_start_date)\n",
    "    # Change 0 as nan for the first and last cycles\n",
    "    ds['ssh_obs'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_obs'].values).filled(np.nan)\n",
    "    ds['ssh_model'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_model'].values).filled(np.nan)\n",
    "    return ds\n",
    "\n",
    "ds_j1_nadir = xr.open_mfdataset('./dc_obs/j1/*.nc', combine='nested', concat_dim='time', preprocess=preprocess_time_j1)\n",
    "ds_j1_nadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read nadir (Tpn Theoritical track)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_tpn(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    time_shift_tpn = 3.731883 # days\n",
    "    ds['time'] = cftime.num2date(ds['time'].values - time_shift_tpn * 24 * 3600, 'seconds since ' + simu_start_date)\n",
    "    # Change 0 as nan for the first and last cycles\n",
    "    ds['ssh_obs'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_obs'].values).filled(np.nan)\n",
    "    ds['ssh_model'].values =  np.ma.masked_where(ds['ssh_model'].values == 0., ds['ssh_model'].values).filled(np.nan)  \n",
    "    return ds\n",
    "\n",
    "ds_tpn_nadir = xr.open_mfdataset('./dc_obs/tpn/*.nc', combine='nested', concat_dim='time', preprocess=preprocess_time_tpn)\n",
    "ds_tpn_nadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read SWOT nadir (Science orbit)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_swot_nadir(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    ds['time'] = cftime.num2date(ds['time'].values, 'seconds since ' + simu_start_date)\n",
    "    return ds\n",
    "\n",
    "ds_swot_nadir = xr.open_mfdataset('./dc_obs/swot/BOOST-SWOT_SWOT_nadir_GULFSTREAM_*.nc', combine='nested', concat_dim='time', preprocess=preprocess_time_swot_nadir)\n",
    "ds_swot_nadir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Read SWOT swath (Science orbit)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_time_swot_karin(ds):\n",
    "    # This preprocessing must be avoided in future dataset release\n",
    "    ds['time'] = cftime.num2date(ds['time'].values, 'seconds since ' + simu_start_date)\n",
    "    # Duplicate time over across track dimension for swath\n",
    "    ds = ds.stack(z=('nC', 'time'))\n",
    "    return ds\n",
    "\n",
    "ds_swot_karin = xr.open_mfdataset('./dc_obs/swot/BOOST-SWOT_SWOT_GULFSTREAM_c*.nc', preprocess=preprocess_time_swot_karin, combine='nested', concat_dim='z')\n",
    "ds_swot_karin"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Make example of nadir and large swath SSH observations, reference SSH fields "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "selection_start_date = '2012-10-01'\n",
    "selection_end_date = '2012-10-10'\n",
    "central_date = '2012-10-05'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds_en_nadir_sel = ds_en_nadir.sel(time=slice(selection_start_date, selection_end_date))\n",
    "ds_g2_nadir_sel = ds_g2_nadir.sel(time=slice(selection_start_date, selection_end_date))\n",
    "ds_tpn_nadir_sel = ds_tpn_nadir.sel(time=slice(selection_start_date, selection_end_date))\n",
    "ds_j1_nadir_sel = ds_j1_nadir.sel(time=slice(selection_start_date, selection_end_date))\n",
    "ds_swot_nadir_sel = ds_swot_nadir.sel(time=slice(selection_start_date, selection_end_date))\n",
    "ds_swot_karin_sel = ds_swot_karin.sel(time=slice(datetime.strptime(selection_start_date, '%Y-%m-%d'), datetime.strptime(selection_end_date,  '%Y-%m-%d' )))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(15, 5))\n",
    "plt.subplot(131)\n",
    "plt.scatter(ds_en_nadir_sel.lon, ds_en_nadir_sel.lat, c=ds_en_nadir_sel.ssh_model, s=20, cmap = 'gist_stern')\n",
    "plt.scatter(ds_j1_nadir_sel.lon, ds_j1_nadir_sel.lat, c=ds_j1_nadir_sel.ssh_model, s=20, cmap = 'gist_stern')\n",
    "plt.scatter(ds_g2_nadir_sel.lon, ds_g2_nadir_sel.lat, c=ds_g2_nadir_sel.ssh_model, s=20, cmap = 'gist_stern')\n",
    "plt.scatter(ds_tpn_nadir_sel.lon, ds_tpn_nadir_sel.lat, c=ds_tpn_nadir_sel.ssh_model, s=20, cmap = 'gist_stern')\n",
    "plt.scatter(ds_swot_nadir_sel.lon, ds_swot_nadir_sel.lat, c=ds_swot_nadir_sel.ssh_model, s=20, cmap = 'gist_stern')\n",
    "plt.xlabel('longitude', fontweight='bold')\n",
    "plt.ylabel('latitude', fontweight='bold')\n",
    "plt.title(f'SSH model @ nadirs')\n",
    "plt.colorbar(orientation='horizontal')\n",
    "plt.subplot(132)\n",
    "plt.scatter(ds_swot_karin_sel.lon, ds_swot_karin_sel.lat, c=ds_swot_karin_sel.ssh_model, s=10, cmap = 'gist_stern')\n",
    "plt.xlabel('longitude', fontweight='bold')\n",
    "plt.ylabel('latitude', fontweight='bold')\n",
    "plt.title(f'SSH model @ SWOT swath')\n",
    "plt.colorbar(orientation='horizontal')\n",
    "plt.subplot(133)\n",
    "plt.pcolormesh(dc_ref.lon%360, dc_ref.lat, dc_ref.sossheig.sel(time=central_date).mean(dim='time'), cmap = 'gist_stern')\n",
    "plt.xlabel('longitude', fontweight='bold')\n",
    "plt.ylabel('latitude', fontweight='bold')\n",
    "plt.title(f'SSH model')\n",
    "plt.colorbar(orientation='horizontal')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
